Characterizing the Shine-Dalgarno Motif: Probability Matrices and Weight Matrices

Authors

  • Dennis Kibler
  • Steven Hampson
Abstract

Methods for identifying biologically significant k-mers by exhaustive evaluation (k ≤ 10) are applied to the pooled Upstream Regions (USR) of all 4289 E. coli ORFs. Instances of the Shine-Dalgarno (SD) site are readily identified using these methods. Using these motif instances as starting points, two motif representations and their training methods, probability matrices and weight matrices, are applied to characterize the complete SD motif. Despite using different representations and objective functions, both methods yield approximately the same motif characterization, providing evidence for the robustness of the result and the effectiveness of the methods. By these measures, about 1/4 of the ORFs have SD sites that are no better than random.
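To make the two representations concrete, the following is a minimal Python sketch, not the authors' training procedure. It builds a position-specific probability matrix from a handful of aligned motif instances and converts it to a log-odds weight matrix, which is one common way to derive weights; the abstract states only that the weight matrices are trained with a different objective function and does not specify it. The function names, the pseudocount, the uniform background, and the example sequences are all assumptions made for illustration.

```python
import math
from collections import Counter

BASES = "ACGT"


def probability_matrix(instances, pseudocount=1.0):
    """Per-position base probabilities from equal-length motif instances.

    The pseudocount is an illustrative smoothing choice, not a value
    taken from the paper.
    """
    width = len(instances[0])
    total = len(instances) + pseudocount * len(BASES)
    matrix = []
    for pos in range(width):
        counts = Counter(seq[pos] for seq in instances)
        matrix.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return matrix


def log_odds_matrix(prob_matrix, background=None):
    """Log2 odds of each base against a background (assumed uniform)."""
    background = background or {b: 0.25 for b in BASES}
    return [
        {b: math.log2(col[b] / background[b]) for b in BASES}
        for col in prob_matrix
    ]


def best_site(weights, region):
    """Return (score, start) of the highest-scoring window in a region."""
    width = len(weights)
    return max(
        (sum(weights[i][region[start + i]] for i in range(width)), start)
        for start in range(len(region) - width + 1)
    )


# Hypothetical SD-like instances and upstream region, for illustration only.
instances = ["AGGAGG", "AGGAGA", "TGGAGG", "AGGAGG"]
weights = log_odds_matrix(probability_matrix(instances))
score, start = best_site(weights, "TTTAAGGAGGTTTATG")
```

Scoring every upstream region this way and comparing the best scores with those obtained on shuffled or random sequence is one plausible way to ask whether an ORF's best site is "better than random", though the paper's own criterion may differ.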

Similar resources

Listeria Monocytogenes La111 and Klebsiella Pneumoniae KCTC 2242: Shine-Dalgarno Sequences

Listeria monocytogenes can cause serious infection, and relapse of listeriosis has recently been reported in leukemia and colorectal cancer; patients with Klebsiella pneumoniae are at increased risk of colorectal cancer. Recognition of the translation initiation codon is mediated mainly by the Shine-Dalgarno (SD) sequence and the anti-SD sequence of the small-subunit ribosomal RNA (ssu rRNA). In this research,...

Learning Weight Matrices for Identifying Regulatory Elements

The structure of DNA regulatory patterns is partially understood, revealing an indeterminacy in the base composition. The dominant approach for representing this intrinsic variability is probability matrices, although some have used IUPAC codes and restricted regular expression languages [1]. In general the goal is to identify patterns that are distinguished from the background, where the backg...

Refining Probability Motifs for the Discovery of Existing Patterns of DNA Bachelor Project

The aim of this project was to build a probability motif refining program. In the past this process has been both too computationally demanding and too time-consuming to be a feasible tool in bioinformatics. The notion is to take a file of DNA sequences containing hidden motifs and apply a set of given position-specific weight matrices to these sequences in order to discover the in...

Evaluating Representations for the Shine-Dalgarno Site in Escherichia coli

Several methods for identifying individual motif instances by exhaustive evaluation of k-mers (k ≤ 10) are applied to the pooled Upstream Regions (USR) of all 4289 Escherichia coli ORFs. Instances of the Shine-Dalgarno (SD) site are readily identified using these methods. Using these motif instances as starting points, various motif representations and training methods, including several new alg...

Expected Coverage Rate for Hellman Matrices

Hellman’s time-memory trade-off is a probabilistic method for inverting one-way functions using pre-computed data. Hellman introduced this method in 1980 and obtained a lower bound for the success probability of his algorithm. Since then, all subsequent analyses have been based on this lower bound. In this paper, we first studied the expected coverage rate (ECR) of the Hellman matrice...

Publication date: 2002